Abstract

Stein's (1972) method is a very general tool for assessing the qual- 
ity of approximation of the distribution of a random element by an- 
other, often simpler, distribution. In applications of Stein's method, 
one needs to establish a Stein identity for the approximating distri- 
bution, solve the Stein equation and estimate the behaviour of the 
solutions in terms of the metrics under study. For some Stein equa- 
tions, solutions with good properties are known; for others, this is not 
the case. Barbour & Xia (1999) introduced a perturbation method 
for Poisson approximation, in which Stein identities for a large class 
of compound Poisson and translated Poisson distributions are viewed 
as perturbations of a Poisson distribution. In this paper, it is shown 
that the method can be extended to very general settings, including 
perturbations of normal, Poisson, compound Poisson, binomial and 
Poisson process approximations in terms of various metrics such as 
the Kolmogorov, Wasserstein and total variation metrics. Examples 
are provided to illustrate how the general perturbation method can 
be applied. 

Keywords: perturbation method, normal distribution, jump diffusion process,
Poisson distribution, compound Poisson distribution, Poisson process, point
process, total variation norm, Kolmogorov distance, Wasserstein distance,
local distance.



1 Introduction 

Many applications of Stein's (1972) method, when approximating the distribution $\mathcal{L}(W)$ of a random element $W$ of a metric space $\mathcal{X}$ by a probability distribution $\pi$, are accomplished broadly as follows. The aim is to estimate $\mathbb{E}h(W) - \pi(h)$ for each member $h$ of a family of test functions $\mathcal{H}$, where $\pi(h) := \int h\,d\pi$. To do this, one finds a normed space $\mathcal{G}$ and an appropriate Stein operator $\mathcal{A}$ on $\mathcal{G}$ characterizing $\pi$; $\mathcal{A}\colon \mathcal{G} \to \mathcal{F} \subset \mathbb{R}^{\mathcal{X}}$, for some $\mathcal{F} \supset \mathcal{H}$, must be such that $\pi(\mathcal{A}g) = 0$ for all $g$ in $\mathcal{G}$, and that $\pi$ is the unique probability distribution for which this is the case. 'Appropriate' in this context means that an inequality of the form

$$|\mathbb{E}\{(\mathcal{A}g)(W)\}| \le \varepsilon\|g\|_{\mathcal{G}}, \qquad g \in \mathcal{G}, \tag{1.1}$$

can be established, for some (small) $\varepsilon$. Finally, for each $h \in \mathcal{H}$, find a function $g_h \in \mathcal{G}$ satisfying the Stein equation

$$\mathcal{A}g_h = h - \pi(h). \tag{1.2}$$

Then it follows from (1.1) that

$$|\mathbb{E}h(W) - \pi(h)| \le \varepsilon\|g_h\|_{\mathcal{G}}. \tag{1.3}$$

Hence, if it can be shown that

$$\|g_h\|_{\mathcal{G}} \le C\|h\|_{\mathcal{F}}, \tag{1.4}$$

for some norm $\|\cdot\|_{\mathcal{F}}$ on $\mathcal{F}$, we can conclude that

$$d_{\mathcal{H}}(\mathcal{L}(W), \pi) \le C\varepsilon \sup_{h \in \mathcal{H}}\|h\|_{\mathcal{F}}, \tag{1.5}$$

where, for any two distributions $P$ and $Q$ on $\mathcal{X}$,

$$d_{\mathcal{H}}(P,Q) := \sup_{h \in \mathcal{H}}|P(h) - Q(h)|. \tag{1.6}$$

Thus, if (1.2) and (1.4) are satisfied, it is enough for the $d_{\mathcal{H}}$-approximation of $\mathcal{L}(W)$ by $\pi$ to establish the inequality (1.1): in this sense, Stein's method for $\pi$ can be said to work for the distance $d_{\mathcal{H}}$. Distances of this form include the total variation distance $d_{TV}$, with $\mathcal{H}$ the set of functions bounded by 1, and the Wasserstein distance $d_W$, with $\mathcal{H}$ the Lipschitz functions with slope bounded by 1.

Probabilistic inequalities of the form (1.1) can be derived by a variety of techniques, including Stein's exchangeable pair approach, the generator method and Taylor expansion. However, the analytic inequality (1.4) can prove to be a stumbling block, especially if a reasonably small value of $C$ is desired, unless $\pi$ happens to be a particularly convenient distribution. For $\mathcal{X} = \mathbb{R}$, the normal and Poisson distributions lead to simple versions of (1.4). However, when introducing Stein's method for compound Poisson distributions, Barbour, Chen & Loh (1992) were only able to prove analogous inequalities with satisfactory values of $C$ for distributions for which the generator method was applicable, and this represents a strong restriction on the compound Poisson family. The class of amenable compound Poisson distributions was subsequently extended in Barbour & Xia (1999), where a perturbation technique was introduced, which enabled the good properties of the solutions of the Poisson operator to be carried over to those of the Stein equations for neighbouring compound Poisson distributions. Their approach was taken further in Barbour & Čekanavičius (2002) and in Čekanavičius (2004). Here, we show that the perturbation idea can be applied not just in the Poisson setting, but in great generality. One consequence is that the range of compound Poisson distributions whose solutions have good properties can be further extended, but the scope of possible applications is much wider. In particular, there is no need to restrict attention to random variables on the real line; distributions and random elements on quite general spaces can be considered.

The perturbation method is discussed in general terms in Section 2. Theorem 2.1 shows how to find the solution $g_h$ in (1.2) for $\mathcal{A} = \mathcal{A}_1$, when $\mathcal{A}_1$ is close enough to a 'nice' Stein operator $\mathcal{A}_0$, and the probability measure $\pi_0$ associated with $\mathcal{A}_0$ has $\mathrm{supp}(\pi_0) = \mathcal{X}$; the theorem also gives the inequality corresponding to (1.4). Theorem 2.4 gives conditions under which Stein's method works, but which do not assume the support condition, and Theorem 2.5 allows a further slight relaxation, which is particularly relevant to approximation of random variables using the Kolmogorov distance. In Section 3, a number of specific examples are given, some of which are illustrated from the point of view of application in Section 4.

As indicated above, there are various ways in which an inequality (1.1) relevant in any particular setting may be derived. This means that the choice of operator $\mathcal{A}_1$, and of the corresponding approximating probability measure $\pi_1$, is frequently dictated by the problem under consideration in a more or less natural way. The choice of $\mathcal{A}_0$ is more a matter of chance. If $\mathcal{A}_1$ is not itself one of the operators for which the solutions to (1.2) are known to satisfy an inequality of the form (1.4), then one looks for an $\mathcal{A}_0$ which is, and which is not too far away from $\mathcal{A}_1$. Such an operator need not exist. In order for our perturbation approach to be successful, it is necessary for the contraction inequality (2.8) to be satisfied, and this limits the set of operators $\mathcal{A}_1$ which can be considered as perturbations of any given $\mathcal{A}_0$, for the purposes of our theorems.

2 Formal approach 

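Let $\mathcal{X}$ be a Polish space, and $\mathcal{G}$ a linear subspace of the functions $g\colon \mathcal{X} \to \mathbb{R}$, equipped with a norm $\|\cdot\|_{\mathcal{G}}$. Suppose that $\pi_0$ is a probability measure on $\mathcal{X}$ with $\mathrm{supp}(\pi_0) = \mathcal{X}_0 \subset \mathcal{X}$. Define

$$\mathcal{F}' := \{f\colon \mathcal{X} \to \mathbb{R},\ \pi_0(|f|) < \infty\};$$
$$\mathcal{F}'_0 := \{f \in \mathcal{F}'\colon f(x) = 0 \text{ for all } x \notin \mathcal{X}_0\};$$
$$\bar{\mathcal{F}}'_0 := \{f \in \mathcal{F}'_0\colon \pi_0(f) = 0\},$$

and let $P_0$ be the projection from $\mathcal{F}'$ onto $\bar{\mathcal{F}}'_0$ given by

$$P_0 f := f\mathbf{1}_{\mathcal{X}_0} - \pi_0(f)\mathbf{1}_{\mathcal{X}_0},$$

where, here and subsequently, $\mathbf{1}_A$ denotes the indicator function of the set $A$, and multiplication of functions is to be understood pointwise. Now let $\|\cdot\|$ be a norm on $\mathcal{F}'$, set

$$\mathcal{F} := \{f \in \mathcal{F}'\colon \|f\| < \infty\},$$

and define $\mathcal{F}_0 := \mathcal{F} \cap \mathcal{F}'_0$ and $\bar{\mathcal{F}}_0 := \mathcal{F} \cap \bar{\mathcal{F}}'_0$; we shall require that $\|\cdot\|$ is such that

$$P_0\colon \mathcal{F} \to \bar{\mathcal{F}}_0. \tag{2.1}$$

We also assume that $\mathcal{F}$ is a determining class of functions for probability measures on $\mathcal{X}$ (Billingsley 1968, p. 15).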

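We now suppose that there is a 'nice' Stein operator $\mathcal{A}_0$ characterizing $\pi_0$. By this, we mean that

$$\mathcal{A}_0\colon \mathcal{G} \to \bar{\mathcal{F}}'_0, \tag{2.2}$$

and also that it is possible to define a right inverse

$$\mathcal{A}_0^{-1}\colon \bar{\mathcal{F}}_0 \to \mathcal{G}_0 := \{g \in \mathcal{G}\colon g(x) = 0 \text{ for all } x \notin \mathcal{X}_0\},$$

satisfying

$$\mathcal{A}_0(\mathcal{A}_0^{-1}f) = f \quad \text{for all } f \in \bar{\mathcal{F}}_0; \tag{2.3}$$

$$\|\mathcal{A}_0^{-1}P_0 f\|_{\mathcal{G}} \le A\|f\|, \qquad f \in \mathcal{F}, \tag{2.4}$$

for some $A < \infty$. Note that (2.2) means that

$$\pi_0(\mathcal{A}_0 g) = 0 \quad \text{for all } g \in \mathcal{G}. \tag{2.5}$$

On the other hand, in view of (2.3), if $\pi$ is any probability measure on $\mathcal{X}_0$ such that $\pi(\mathcal{A}_0 g) = 0$ for all $g \in \mathcal{G}$, then $\pi(f) = 0$ for all $f \in \bar{\mathcal{F}}_0$, meaning that $\pi(f) = \pi_0(f)$ for all $f \in \mathcal{F}_0$, and hence for all $f \in \mathcal{F}$. Since $\mathcal{F}$ is a determining class, $\pi = \pi_0$, and $\mathcal{A}_0$ characterizes $\pi_0$ through (2.5).

In the setting of the introduction, for $\mathcal{H} \subset \mathcal{F}_0$ a family of test functions, we have $h(x) - \pi_0(h) = (P_0 h)(x)$ for $x \in \mathcal{X}_0$, so that we can take $g_h = \mathcal{A}_0^{-1}P_0 h$ and obtain (1.2), in view of (2.3). Inequality (2.4) is just (1.4) for $g_f$, with $f$ in place of $h$. Hence, because of (1.5), Stein's method for $\pi_0$ based on (1.1) (with $\mathcal{A}_0$ in place of $\mathcal{A}$) works for distances based on families $\mathcal{H}$ of test functions whose norms are uniformly bounded. Our interest here is in extending this to probability measures $\pi_1$ characterized by generators $\mathcal{A}_1$ which are close to $\mathcal{A}_0$.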

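So let $\pi_1$ be a finite signed measure on $\mathcal{X}$ with $\pi_1(\mathcal{X}) = 1$, and such that $|\pi_1|(|f|) < \infty$ for $f \in \mathcal{F}$. Let $\mathcal{A}_1$ be a Stein operator for $\pi_1$, meaning that $\mathcal{A}_1\colon \mathcal{G} \to \mathcal{F}'_1$, where

$$\mathcal{F}'_1 := \{f\colon \mathcal{X} \to \mathbb{R};\ |\pi_1|(|f|) < \infty,\ \pi_1(f) = 0\},$$

so that

$$\pi_1(\mathcal{A}_1 g) = 0 \quad \text{for all } g \in \mathcal{G}; \tag{2.6}$$

set $\mathcal{U} = \mathcal{A}_1 - \mathcal{A}_0$, and assume also that

$$\mathcal{U}\mathcal{A}_0^{-1}P_0\colon \mathcal{F} \to \mathcal{F}. \tag{2.7}$$

The key assumption which ensures that $\mathcal{A}_1$ can fruitfully be thought of as a perturbation of $\mathcal{A}_0$ is that

$$\|\mathcal{U}\mathcal{A}_0^{-1}P_0\| =: \gamma < 1. \tag{2.8}$$

Remark. Having to satisfy the condition (2.8) significantly limits the choice of distributions $\pi_1$ whose Stein equations can be treated as perturbations of that for $\pi_0$. This is clearly illustrated in the examples of the next section.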

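Theorem 2.1 With the above definitions, suppose that assumptions (2.1)-(2.4) and (2.6)-(2.8) are satisfied. Then the operator

$$\mathcal{B} := \mathcal{A}_0^{-1}P_0\sum_{j \ge 0}(-1)^j(\mathcal{U}\mathcal{A}_0^{-1}P_0)^j\colon \mathcal{F} \to \mathcal{G}_0 \tag{2.9}$$

is well defined, and

$$\|\mathcal{B}\| \le A/(1-\gamma); \qquad \|\mathcal{U}\mathcal{B}\| \le \gamma/(1-\gamma). \tag{2.10}$$

Furthermore, for $f \in \mathcal{F}$ and for all $x \in \mathcal{X}_0$,

$$(\mathcal{A}_1\mathcal{B}f)(x) - (P_1 f)(x) = c(f) = \pi_1(f) - \pi_0(f) + \pi_0(\mathcal{U}\mathcal{B}f), \tag{2.11}$$

where $P_1 f = f - \pi_1(f)\mathbf{1}$; here, $\mathbf{1} = \mathbf{1}_{\mathcal{X}}$. In particular, if $\mathcal{X}_0 = \mathcal{X}$, we have $c(f) = 0$, so that $\mathcal{B}$ is a right inverse of $\mathcal{A}_1$ on $\mathcal{F}'_1 \cap \mathcal{F}$.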

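Proof. The first part is immediate from (2.4) and (2.8), from the properties of $\mathcal{A}_0^{-1}$ and from (2.7). It is then also immediate that

$$(\mathcal{A}_0 + P_0\mathcal{U})\mathcal{B}f = P_0 f, \qquad f \in \mathcal{F}.$$

Hence, for $f \in \mathcal{F}$, we have

$$\mathcal{A}_1\mathcal{B}f = (\mathcal{A}_0 + P_0\mathcal{U} + (I - P_0)\mathcal{U})\mathcal{B}f = P_0 f + (\mathcal{U}\mathcal{B}f)\mathbf{1}_{\mathcal{X}_0^c} + \pi_0(\mathcal{U}\mathcal{B}f)\mathbf{1}_{\mathcal{X}_0}, \tag{2.12}$$

so that, for $x \in \mathcal{X}_0$,

$$(\mathcal{A}_1\mathcal{B}f)(x) - (P_1 f)(x) = \pi_1(f) - \pi_0(f) + \pi_0(\mathcal{U}\mathcal{B}f) =: c(f). \tag{2.13}$$

For the constant $c(f)$, note that, from (2.6) with $\mathcal{B}f$ for $g$ and from (2.12), we have

$$0 = \pi_1(f\mathbf{1}_{\mathcal{X}_0}) - \pi_1(\mathcal{X}_0)\pi_0(f) + \pi_1((\mathcal{U}\mathcal{B}f)\mathbf{1}_{\mathcal{X}_0^c}) + \pi_0(\mathcal{U}\mathcal{B}f)\pi_1(\mathcal{X}_0)$$
$$= \pi_1(f) - \pi_1(f\mathbf{1}_{\mathcal{X}_0^c}) - \pi_0(f) + \pi_1(\mathcal{X}_0^c)\pi_0(f) + \pi_1((\mathcal{U}\mathcal{B}f)\mathbf{1}_{\mathcal{X}_0^c}) + \pi_0(\mathcal{U}\mathcal{B}f)(1 - \pi_1(\mathcal{X}_0^c)).$$

This implies, from the first part of the theorem, that

$$c(f) = \pi_1(f) - \pi_0(f) + \pi_0(\mathcal{U}\mathcal{B}f) = 0$$

if $|\pi_1|(\mathcal{X}_0^c) = 0$, and

$$c(f) = \pi_1(f\mathbf{1}_{\mathcal{X}_0^c}) - \pi_1(\mathcal{X}_0^c)\pi_0(f) - \pi_1((\mathcal{U}\mathcal{B}f)\mathbf{1}_{\mathcal{X}_0^c}) + \pi_0(\mathcal{U}\mathcal{B}f)\pi_1(\mathcal{X}_0^c) \tag{2.14}$$

otherwise. □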
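The series in (2.9) is a Neumann series, and its behaviour can be seen in a finite-dimensional toy model. The sketch below is an illustration only, under simplifying assumptions that are not part of the paper: the operators are 2x2 matrices, $P_0$ is taken to be the identity (the analogue of $\mathcal{X}_0 = \mathcal{X}$ with an invertible $\mathcal{A}_0$), and the helper names `matmul`, `add`, `identity`, `sup_norm` and `neumann_inverse` are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    """The n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def sup_norm(a):
    """Operator norm induced by the sup norm on vectors (max absolute row sum)."""
    return max(sum(abs(x) for x in row) for row in a)


def neumann_inverse(a0_inv, u, terms=200):
    """Truncated analogue of (2.9): A0^{-1} * sum_{j>=0} (-1)^j (U A0^{-1})^j."""
    k = matmul(u, a0_inv)                      # K = U A0^{-1}
    neg_k = [[-x for x in row] for row in k]
    power = identity(len(k))                   # (-K)^0
    series = identity(len(k))
    for _ in range(1, terms):
        power = matmul(power, neg_k)           # (-K)^j
        series = add(series, power)
    return matmul(a0_inv, series)


# Toy data (assumed for illustration): a 'nice' operator A0 and a small perturbation U.
a0 = [[2.0, 0.0], [0.0, 4.0]]
a0_inv = [[0.5, 0.0], [0.0, 0.25]]
u = [[0.3, -0.2], [0.5, 0.4]]

gamma = sup_norm(matmul(u, a0_inv))            # contraction constant, about 0.35
b = neumann_inverse(a0_inv, u)
a1 = add(a0, u)                                # perturbed operator A1 = A0 + U
residual = add(matmul(a1, b), [[-1.0, 0.0], [0.0, -1.0]])  # A1 B - I
```

In this toy setting `residual` is numerically zero, so the truncated series acts as a right inverse of `a1`, and the two norm bounds of (2.10) can be checked directly with `sup_norm`, taking $A$ to be `sup_norm(a0_inv)`.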

Remark. If $\mathcal{X}_0 = \mathcal{X}$, then it follows from Theorem 2.1 that

$$\mathcal{A}_1\mathcal{B}h = P_1 h = h - \pi_1(h)$$

for test functions $h \in \mathcal{H} \subset \mathcal{F}$. Hence, for such $h$, the function $g_h := \mathcal{B}h$ satisfies (1.2), where $\mathcal{A}$ is replaced by $\mathcal{A}_1$ and $\pi$ by $\pi_1$. It then follows from (2.10) that $\|g_h\|_{\mathcal{G}} \le A(1-\gamma)^{-1}\|h\|$, so that (1.4) is satisfied with $C = A/(1-\gamma)$, and hence Stein's method for $\pi_1$ based on (1.1) (with $\mathcal{A}_1$ for $\mathcal{A}$) works for distances $d_{\mathcal{H}}$ derived from bounded families of test functions.

If $\mathcal{X}_0 \ne \mathcal{X}$, the inequalities (2.10) are still satisfied, so that (1.4) is still true with $C = A/(1-\gamma)$ if $g_h = \mathcal{B}h$. However, this choice of $g_h$ now gives only an approximate solution to (1.2):

$$(\mathcal{A}_1 g_h)(x) = h(x) - \pi_1(h) + c(h), \qquad x \in \mathcal{X}_0. \tag{2.15}$$

This is still enough to show that Stein's method works for $\pi_1$ based on (1.1) (with $\mathcal{A}_1$ for $\mathcal{A}$), as is demonstrated in Theorem 2.4 below. To make the connection, we first need two more lemmas.

The first concerns the size of $|c(f)|$. This can be controlled in a number of ways, two of which are given in the following lemma. For any finite signed measure $\pi$ and any $A \subset \mathcal{X}$, we define

$$\kappa(\pi, A) := \sup_{\{f \in \mathcal{F}\colon \|f\| \le 1\}}|\pi|(\tilde{f}\mathbf{1}_A), \tag{2.16}$$

where $\tilde{f}(x) := |f(x) - \pi_0(f)|$.
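Lemma 2.2 For $f \in \mathcal{F}$, we have

$$\text{(a)}\quad |c(f)| \le \frac{2|\pi_1|(\mathcal{X}_0^c)\,F\,\|f\|}{1-\gamma}; \qquad \text{(b)}\quad |c(f)| \le \frac{\kappa(\pi_1, \mathcal{X}_0^c)\,\|f\|}{1-\gamma},$$

where $F := \sup_{\{f \in \mathcal{F}\colon \|f\| \le 1\}}\|f\|_\infty$ is the constant appearing in (2.19).

Proof. The proof is immediate from (2.14) and (2.16). □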

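The second lemma translates (2.15) into an inequality bounding the difference $|\pi(f) - \pi_1(f)|$ in terms of $|\pi(\mathcal{A}_1\mathcal{B}f)|$, for a general probability measure $\pi$ on $\mathcal{X}$.

Lemma 2.3 Under the conditions of Theorem 2.1, if $\pi$ is any probability measure on $\mathcal{X}$, then, for any $f \in \mathcal{F}$, we have

$$|\pi(f) - \pi_1(f)| \le |\pi(\mathcal{A}_1\mathcal{B}f)| + \min\Big\{2(1-\gamma)^{-1}\{|\pi_1|(\mathcal{X}_0^c) + \pi(\mathcal{X}_0^c)\}F,\ (1-\gamma)^{-1}\{\kappa(\pi_1, \mathcal{X}_0^c) + \kappa(\pi, \mathcal{X}_0^c)\}\Big\}\,\|f\|,$$

with $F := \sup_{\{f \in \mathcal{F}\colon \|f\| \le 1\}}\|f\|_\infty$ as in (2.19).

Proof. It follows from (2.12) that

$$\pi(\mathcal{A}_1\mathcal{B}f) = \pi(P_0 f) + \pi((\mathcal{U}\mathcal{B}f)\mathbf{1}_{\mathcal{X}_0^c}) + \pi_0(\mathcal{U}\mathcal{B}f)\pi(\mathcal{X}_0)$$
$$= \pi(f) - \pi(f\mathbf{1}_{\mathcal{X}_0^c}) - \pi_0(f)(1 - \pi(\mathcal{X}_0^c)) + \pi((\mathcal{U}\mathcal{B}f)\mathbf{1}_{\mathcal{X}_0^c}) + \pi_0(\mathcal{U}\mathcal{B}f)(1 - \pi(\mathcal{X}_0^c))$$
$$= \{\pi(f) - \pi_1(f)\} + c(f) - \pi(f\mathbf{1}_{\mathcal{X}_0^c}) + (\pi_0(f) - \pi_0(\mathcal{U}\mathcal{B}f))\pi(\mathcal{X}_0^c) + \pi((\mathcal{U}\mathcal{B}f)\mathbf{1}_{\mathcal{X}_0^c}).$$

Hence, and using (2.16), the lemma follows. □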
This lemma gives the information that we need, when deriving distributional approximations in terms of the measure $\pi_1$. Let $\mathcal{H} \subset \mathcal{F}$ be any collection of test functions which forms a determining class for probability measures on $\mathcal{X}$. Then define the metric $d_{\mathcal{H}}$ on finite signed measures $\rho$, $\sigma$ on $\mathcal{X}$, by

$$d_{\mathcal{H}}(\rho, \sigma) := \sup_{h \in \mathcal{H}}|\rho(h) - \sigma(h)|. \tag{2.17}$$

In the special case where $\mathcal{H} := \{f \in \mathcal{F}\colon \|f\| \le 1\}$, we write $d_{\mathcal{F}}$ for $d_{\mathcal{H}}$. The following theorem shows that Stein's method for $\pi_1$ based on (2.18) works for the distance $d_{\mathcal{F}}$, even when $\mathcal{X}_0 \ne \mathcal{X}$.

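Theorem 2.4 Suppose that the conditions of Theorem 2.1 are satisfied, and write $g^0_f := \mathcal{A}_0^{-1}P_0 f$ for all $f \in \mathcal{F}$. Then, if

$$|\pi(\mathcal{A}_1 g^0_f)| \le \varepsilon\|g^0_f\|_{\mathcal{G}} \quad \text{for all } f \in \mathcal{F}, \tag{2.18}$$

it follows that

$$d_{\mathcal{F}}(\pi, \pi_1) \le (1-\gamma)^{-1}\{A\varepsilon + \varepsilon'(\pi, \pi_1)\},$$

where

$$\varepsilon'(\pi, \pi_1) := \min\{2(|\pi_1|(\mathcal{X}_0^c) + \pi(\mathcal{X}_0^c))F,\ \kappa(\pi_1, \mathcal{X}_0^c) + \kappa(\pi, \mathcal{X}_0^c)\},$$

and

$$F := \sup_{\{f \in \mathcal{F}\colon \|f\| \le 1\}}\|f\|_\infty. \tag{2.19}$$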

Proof. In fact, let $\hat{f} = \sum_{j \ge 0}(-1)^j(\mathcal{U}\mathcal{A}_0^{-1}P_0)^j f$, so that $\mathcal{B}f = g^0_{\hat{f}}$. Then (2.18), together with (2.4) and (2.8), implies that

$$|\pi(\mathcal{A}_1\mathcal{B}f)| = |\pi(\mathcal{A}_1 g^0_{\hat{f}})| \le \varepsilon\|g^0_{\hat{f}}\|_{\mathcal{G}} \le A\varepsilon\|\hat{f}\| \le \frac{A\varepsilon\|f\|}{1-\gamma}.$$

Thus the conclusion follows immediately from Lemma 2.3 and from the definition of $d_{\mathcal{F}}$. □

Note that (2.18) is a weakening of what would normally be required for (1.1), inasmuch as the inequality is only needed for the functions $g^0_f$, which, being the solutions to the Stein equation for the 'nice' operator $\mathcal{A}_0$, may well be known in advance to have good properties.

Theorem 2.4 is applied most simply when $\pi$ is the distribution of some random element $W$, for which it can be shown that

$$|\mathbb{E}(\mathcal{A}_1 g)(W)| \le \sum_j \varepsilon_j c_j(g), \qquad g \in \mathcal{G}. \tag{2.20}$$

Here, the quantities $\varepsilon_j$ are to be computed using $W$ alone, and the function $g$ enters only through the constants $c_j(g)$. If the norm $\|\cdot\|$ on $\mathcal{F}$ can be chosen in such a way that the $c_j(g^0_f)$ can be bounded by a multiple of $\|f\|$ for any $f \in \mathcal{F}$, then Theorem 2.4 can be invoked.

The choice of norms on $\mathcal{F}$ for which this procedure can be carried through depends very much on the structure of the random variable $W$: see Section 4. Broadly speaking, for the more stringent norms, the contraction condition (2.8) is harder to satisfy; on the other hand, there are then fewer functions having finite norm, and so the inequality (2.18) is easier to establish. Take, for example, standard normal approximation, with $\mathcal{G}$ the space of bounded real functions with bounded first and second derivatives, endowed with the norm

$$\|g\|_{\mathcal{G}} := \|g\|_\infty + \|g'\|_\infty + \|g''\|_\infty, \tag{2.21}$$

and with the Stein operator given by

$$(\mathcal{A}_0 g)(x) = g'(x) - xg(x), \qquad g \in \mathcal{G}. \tag{2.22}$$

Here, it is possible, in many central limit settings, to derive an inequality of the form (1.1):

$$|\mathbb{E}(\mathcal{A}_0 g)(W)| \le \varepsilon\|g\|_{\mathcal{G}}$$

for some $\varepsilon$, as, for example, in Chen & Shao (2005, p. 5). Now, for $g^0_f = \mathcal{A}_0^{-1}P_0 f$ with $\|f'\|_\infty < \infty$, we have $\|(g^0_f)''\|_\infty \le 4\|f'\|_\infty$ by Proposition 5.1 (c)(i) and (iii) with $y = g^0_f$, so that inequality (1.4) is satisfied with

$$\|f\|^{(1)} := \|f\|_\infty + \|f'\|_\infty \tag{2.23}$$

as norm on $\mathcal{F}$. This, in turn, leads to corresponding approximations with respect to the distance $d_{\mathcal{F}} =: d^{(1)}$ from (1.5).

In the usual central limit context, there is typically no hope of taking the argument further, and choosing $\mathcal{H} = \mathcal{F}$ for the supremum norm $\|\cdot\|_\infty$ in place of $\|\cdot\|^{(1)}$ on $\mathcal{F}$. This is not because the perturbation argument would fail, but because there can usually be no inequality of the form $|\mathbb{E}(P_0 f)(W)| \le \varepsilon\|f\|_\infty$ for all $f \in \mathcal{F}$ unless $\varepsilon$ is rather large; this is because the supremum of the left hand side is then just the total variation distance between $\mathcal{L}(W)$ and the standard normal distribution, and this is not necessarily small under the usual conditions for the central limit theorem. More is, however, possible with some extra restrictions: see Cacoullos et al. (1994) and Example 4.1.

The distance $d^{(1)}$ is not the one most commonly used for measuring the accuracy of approximation in the central limit theorem. Here, it is usual to work with the Kolmogorov distance $d_K$, which is of the form defined in (2.17), with the set of test functions

$$\mathcal{H}_K := \{\mathbf{1}_{(-\infty, a]}\colon a \in \mathbb{R}\}.$$

For these test functions, it can in many central limit applications be established, albeit with rather more effort, that $|\mathbb{E}(\mathcal{A}_0 g^0_h)(W)|$ is bounded, uniformly for $h \in \mathcal{H}_K$, by a quantity of the form $k\varepsilon$ for some $k < \infty$ and $\varepsilon$ reflecting the closeness of $\mathcal{L}(W)$ and the standard normal distribution. This in turn, with (1.2), implies error estimates for standard normal approximation, measured with respect to Kolmogorov distance.

Now the set $\mathcal{H}_K$ forms a subset of $\mathcal{F}$, when the supremum norm is taken on $\mathcal{F}$, and the perturbation arguments leading to Lemma 2.3 can still be applied successfully, for Stein operators $\mathcal{A}_1$ suitably close to $\mathcal{A}_0$. However, in order to deduce distance estimates as in Theorem 2.4, it is necessary to be able to bound $|\mathbb{E}(\mathcal{A}_0 g^0_f)(W)|$ not only for $f \in \mathcal{H}_K$, but also for any $f$ of the form $f := (\mathcal{U}\mathcal{A}_0^{-1}P_0)^j h$, where $h \in \mathcal{H}_K$ and $j \ge 1$, since these functions are used to make up the function $\hat{f}$ introduced in the proof of Theorem 2.4. Now these functions $f$ are not typically in the set $\mathcal{H}_K$. However, it can at least be shown that both $g^0_h$ and $(g^0_h)'$ are uniformly bounded for $h \in \mathcal{H}_K$. For some operators $\mathcal{A}_1$, this is enough to be able to conclude that

$$\sup_{h \in \mathcal{H}_K}\|\mathcal{U}g^0_h\|^{(1)} < \infty.$$

It is then possible to apply the following result, in which the Stein operator $\mathcal{A}_0$ is now quite general.
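Theorem 2.5 Suppose that the conditions of Theorem 2.1 are satisfied, and that $\mathcal{H}$ is any family of test functions with $H := \sup_{h \in \mathcal{H}}\|h\|_\infty < \infty$, and such that $g^0_h := \mathcal{A}_0^{-1}P_0 h$ is well defined for $h \in \mathcal{H}$, satisfying $\mathcal{A}_0 g^0_h = P_0 h$ and $\mathcal{U}g^0_h \in \mathcal{F}$. Assume further that

$$\gamma_{\mathcal{H}} := H^{-1}\sup_{h \in \mathcal{H}}\|\mathcal{U}g^0_h\| < \infty. \tag{2.24}$$

Then, if $\pi$ is such that

$$\sup_{h \in \mathcal{H}}|\pi(\mathcal{A}_1 g^0_h)| \le H\varepsilon_1, \tag{2.25}$$

and

$$|\pi(\mathcal{A}_1 g^0_f)| \le \varepsilon_2\|g^0_f\|_{\mathcal{G}}, \qquad f \in \mathcal{F}, \tag{2.26}$$

it follows that

$$d_{\mathcal{H}}(\pi, \pi_1) \le H\Big\{\varepsilon_1 + \frac{\gamma_{\mathcal{H}}A\varepsilon_2}{1-\gamma} + \frac{\varepsilon(\pi, \pi_1)}{1-\gamma}\Big\},$$

where $\varepsilon(\pi, \pi_1) := \kappa(\pi_1, \mathcal{X}_0^c) + \kappa(\pi, \mathcal{X}_0^c)$.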



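Proof. Once again, much as in the proof of Theorem 2.4, we note that

$$|\pi(\mathcal{A}_1\mathcal{B}h)| \le \sum_{j \ge 0}|\pi(\mathcal{A}_1\mathcal{A}_0^{-1}P_0(\mathcal{U}\mathcal{A}_0^{-1}P_0)^j h)| = |\pi(\mathcal{A}_1 g^0_h)| + \sum_{j \ge 1}|\pi(\mathcal{A}_1 g^0_{f_j})|, \tag{2.27}$$

where $f_j := (\mathcal{U}\mathcal{A}_0^{-1}P_0)^j h$, $j \ge 1$. Now, for $h \in \mathcal{H}$,

$$\|f_1\| = \|\mathcal{U}g^0_h\| \le \gamma_{\mathcal{H}}H,$$

by (2.24), and then, by (2.8),

$$\|f_j\| \le \gamma^{j-1}\gamma_{\mathcal{H}}H, \qquad j \ge 2.$$

Hence, from (2.27), (2.4), (2.25) and (2.26), it follows that

$$\sup_{h \in \mathcal{H}}|\pi(\mathcal{A}_1\mathcal{B}h)| \le H\varepsilon_1 + \sum_{j \ge 1}\varepsilon_2 A\gamma^{j-1}\gamma_{\mathcal{H}}H,$$

and the theorem now follows from Lemma 2.3. □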
In particular, if $\mathcal{A}_0$ is the Stein operator for normal approximation given in (2.22), and taking the norm $\|\cdot\|^{(1)}$, Theorem 2.5 can be applied with $\mathcal{H} = \mathcal{H}_K$; in circumstances in which the conditions (2.24)-(2.26) are satisfied, this leads to estimates of the error in approximating the distribution $\pi$ of a random variable $W$ by $\pi_1$, measured with respect to Kolmogorov distance. In particular, the estimates (2.25) and (2.26) relating to the distribution of $W$ are of a kind which can often be verified in practice; see Section 4.

3 Examples 

In the first two examples, the sets $\mathcal{X}_0$ and $\mathcal{X}$ are the same, so that the elements in the bounds involving probabilities of the set $\mathcal{X}_0^c$ make no contribution. The first of these is purely for illustration, since properties of the Stein equation for the perturbed distribution could be obtained directly.

Example 3.1. In this example, we consider approximation by the probability distribution $\pi_1 := t_{m,\psi}$ on $\mathbb{R}$, with density

$$p_{m,\psi}(x) = k_{m,\psi}\,e^{-(1-\psi)x^2/2}\,(1 + x^2/m)^{-\psi(m+1)/2}, \qquad x \in \mathcal{X} := \mathbb{R},$$

where $k_{m,\psi}$ is an appropriate normalizing constant. This family of densities interpolates between the standard normal ($\psi = 0$) and Student's $t_m$ distribution ($\psi = 1$), as $\psi$ moves from 0 to 1; $m$ is classically a positive integer. We take for $\mathcal{G}$ the space of bounded real functions with bounded derivatives, endowed with the norm

$$\|g\|_{\mathcal{G}} := \|g\|_\infty + \|g'\|_\infty.$$

An appropriate Stein operator $\mathcal{A}_1$ for $t_{m,\psi}$ is given by

$$(\mathcal{A}_1 g)(x) = g'(x) - x\Big\{(1-\psi) + \frac{\psi(m+1)}{m + x^2}\Big\}g(x), \qquad g \in \mathcal{G}; \tag{3.1}$$

this follows because $p_{m,\psi}(x)$ is an integrating factor for the right hand side of (3.1), and hence, for any $g \in \mathcal{G}$,

$$\int_{-\infty}^{\infty}(\mathcal{A}_1 g)(x)\,p_{m,\psi}(x)\,dx = \big[g(x)p_{m,\psi}(x)\big]_{-\infty}^{\infty} = 0,$$

so that (2.6) is satisfied. Now, at least for small enough $\psi$, $\mathcal{A}_1$ could be thought of as a perturbation of the Stein operator for the standard normal distribution,

$$(\mathcal{A}_0 g)(x) = g'(x) - xg(x), \qquad g \in \mathcal{G},$$

discussed above, whose properties are well documented: see, for example, Chen & Shao (2005, Lemmas 2.2 and 2.3). Rather than take the standard normal for $\pi_0$, we actually prefer to perturb from a normal distribution $\mathcal{N}(0, (1-\psi)^{-1})$. This has Stein operator

$$(\mathcal{A}_0 g)(x) = g'(x) - (1-\psi)xg(x), \qquad g \in \mathcal{G}, \tag{3.2}$$

which gives

$$(\mathcal{U}g)(x) = -x\,\frac{\psi(m+1)}{m + x^2}\,g(x), \qquad g \in \mathcal{G}.$$

The properties of $\mathcal{A}_0^{-1}$ are as given in Proposition 5.1, with $y$ replaced by $g$. For the supremum norm on $\mathcal{F}$, we find that assumptions (2.1)-(2.4) and (2.6)-(2.7) are satisfied, and that

$$\sup_x|x(\mathcal{A}_0^{-1}P_0 f)(x)| \le 2(1-\psi)^{-1}\|f\|_\infty;$$

$$\|\mathcal{U}\mathcal{A}_0^{-1}P_0\| \le 2\psi(1-\psi)^{-1}(1 + 1/m) =: \gamma,$$

from Proposition 5.1 (b)(iii). Condition (2.8) is satisfied if $\gamma < 1$, in which case Theorem 2.4 shows that Stein's method works.

Note, however, that Student's $t_m$ distribution itself is too far from the normal for this perturbation argument to be applied, since then $\psi = 1$, and so $\gamma = \infty$.
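To get a feel for how restrictive the contraction condition is here, the constant can be tabulated. The following is a minimal sketch, assuming that the bound in this example reads $\gamma = 2\psi(1-\psi)^{-1}(1 + 1/m)$; the function names `gamma_student` and `max_psi` are hypothetical helpers introduced only for illustration.

```python
def gamma_student(psi, m):
    """Assumed contraction constant gamma = 2*psi*(1 + 1/m)/(1 - psi), for 0 <= psi < 1."""
    if not 0.0 <= psi < 1.0:
        raise ValueError("psi must lie in [0, 1); gamma is infinite at psi = 1")
    return 2.0 * psi * (1.0 + 1.0 / m) / (1.0 - psi)


def max_psi(m):
    """Supremum of the psi for which gamma < 1.

    gamma < 1  <=>  2*psi*(1 + 1/m) < 1 - psi  <=>  psi < 1/(3 + 2/m).
    """
    return 1.0 / (3.0 + 2.0 / m)
```

Under this reading, the admissible range of $\psi$ is always below $1/3$, whatever the value of $m$, which is consistent with the observation that $\psi = 1$ is out of reach.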

For bounded functions $f$ with bounded derivative, it follows from Proposition 5.1 (c)(iv) that

snp\x{A^'Pofy{x)\ < -A^II/'IU. 

X 1-tp 

This translates into a bound for $\|\mathcal{U}\mathcal{A}_0^{-1}P_0 f\|^{(1)}$, and (2.8) is then satisfied for all $\psi$ small enough. As for normal approximation, bounding $\mathbb{E}\{(\mathcal{A}_0 g)(W)\}$ by a linear combination of $\|g\|_\infty$, $\|g'\|_\infty$ and $\|g''\|_\infty$ may be a much more reasonable prospect than using only $\|g\|_\infty$ and $\|g'\|_\infty$, and these quantities are themselves all bounded by multiples of $\|f\|^{(1)}$, for $g = \mathcal{A}_0^{-1}P_0 f$ and $f \in \mathcal{F}^{(1)} := \{f \in \mathcal{F}\colon \|f\|^{(1)} < \infty\}$: see Proposition 5.1 (c)(i)-(iii), with $y = g$. In such cases, $d^{(1)}$-approximation is a consequence.

To deduce Kolmogorov distance approximation using Theorem 2.5, note that, for $\mathcal{H} = \mathcal{H}_K$,

$$\gamma_{\mathcal{H}} = \sup_{h \in \mathcal{H}_K}\|\mathcal{U}g^0_h\|^{(1)} < \infty,$$

from Proposition 5.1 (a)(i)-(iii). If an approximation with respect to $d^{(1)}$ can be obtained from Theorem 2.4, then the estimate used in (2.18) can be used also in (2.26), and the main further obstacle is thus to verify condition (2.25).

Example 3.2. Our second example also concerns a perturbation of the normal distribution, but now to a distribution $\pi_1$ whose Stein operator is not so easy to handle directly. This time, we take for $\mathcal{G}$ the space of real functions $g$ with $g(0) = 0$ and having bounded first and second derivatives, endowed with the norm

$$\|g\|_{\mathcal{G}} := \|g'\|_\infty + \|g''\|_\infty.$$

As Stein operator $\mathcal{A}_1$, we fix $\alpha > 0$ and take the expression

$$(\mathcal{A}_1 g)(x) = g''(x) - xg'(x) + \alpha\{g(x+z) - g(x)\}, \qquad g \in \mathcal{G}, \tag{3.3}$$

which can be viewed as a perturbation of the Stein operator

$$(\mathcal{A}_0 g)(x) = g''(x) - xg'(x), \qquad g \in \mathcal{G},$$

characterizing the standard normal distribution. This operator is equivalent to that given in (2.22), and the properties of $\mathcal{A}_0^{-1}$ are given in Proposition 5.1 with $y = g'$ and $\psi = 0$. The distribution $\pi_1$ is that of the equilibrium of a jump-diffusion process $X$, with unit infinitesimal variance, and having jumps of size $z$ at rate $\alpha$.

Once again, taking the supremum norm on $\mathcal{F}$, it is easy to check that assumptions (2.1)-(2.4) and (2.6)-(2.7) are satisfied, and since

$$\|(\mathcal{A}_0^{-1}P_0 f)'\|_\infty \le \sqrt{2\pi}\,\|f\|_\infty,$$

from Proposition 5.1 (b)(i), it follows that

$$\|\mathcal{U}\mathcal{A}_0^{-1}P_0\| \le \sqrt{2\pi}\,z\alpha. \tag{3.4}$$

Hence, from Theorem 2.4, Stein's method works for $\pi_1$ if $\gamma = \sqrt{2\pi}\,z\alpha < 1$; an estimate of the form (2.18) is all that is needed.

As above, the supremum norm may be more difficult to exploit in practice than the norm $\|\cdot\|^{(1)}$. Here, for $f \in \mathcal{F}$, and writing $g^0_f = \mathcal{A}_0^{-1}P_0 f$, we have

$$|(\mathcal{U}g^0_f)'(x)| \le \alpha\int_0^z|(g^0_f)''(x+t)|\,dt \le 4\alpha z\|f\|_\infty$$

from Proposition 5.1 (b)(ii), and Theorem 2.4 can be applied if $\alpha$ is small enough that $\gamma = (4 + \sqrt{2\pi})z\alpha < 1$.
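The two contraction constants of this example are easy to compare numerically. The sketch below assumes the readings $\gamma = \sqrt{2\pi}\,z\alpha$ for the supremum norm and $\gamma = (4 + \sqrt{2\pi})z\alpha$ for the norm $\|\cdot\|^{(1)}$; the function names `gamma_sup` and `gamma_c1` are hypothetical.

```python
import math


def gamma_sup(alpha, z):
    """Assumed contraction constant for the supremum norm: sqrt(2*pi) * z * alpha."""
    return math.sqrt(2.0 * math.pi) * z * alpha


def gamma_c1(alpha, z):
    """Assumed contraction constant for the norm ||f||_inf + ||f'||_inf: (4 + sqrt(2*pi)) * z * alpha."""
    return (4.0 + math.sqrt(2.0 * math.pi)) * z * alpha


def admissible(alpha, z):
    """Report, for each norm, whether the contraction condition gamma < 1 holds."""
    return {"sup": gamma_sup(alpha, z) < 1.0, "c1": gamma_c1(alpha, z) < 1.0}
```

For instance, with jump rate 0.5 and jump size 0.5 the supremum norm condition holds but the $\|\cdot\|^{(1)}$ condition does not, illustrating that the more stringent norm makes the contraction condition harder to satisfy.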

For Kolmogorov approximation, note that, for $\mathcal{H} = \mathcal{H}_K$,

$$\gamma_{\mathcal{H}} = \sup_{h \in \mathcal{H}_K}\|\mathcal{U}g^0_h\|^{(1)} \le (1 + \sqrt{2\pi}/4)\,\alpha z,$$

by Proposition 5.1 (a)(i)-(ii). Once again, the main effort in addition to $d^{(1)}$-approximation is to verify (2.25) of Theorem 2.5.

Note that we are also free to perturb from other normal distributions. If we choose to centre at the mean $\alpha z$ of $\pi_1$, we can do so by writing

$$(\mathcal{A}_1 g)(x) = g''(x) - (x - \alpha z)g'(x) + \alpha\{g(x+z) - g(x) - zg'(x)\}, \qquad g \in \mathcal{G},$$

with the first two terms the Stein operator for the normal distribution $\mathcal{N}(\alpha z, 1)$. The third, perturbation term can be bounded by $2\alpha z^2\|f\|_\infty$, and its derivative by $\alpha z^2\|f'\|_\infty$ (Proposition 5.1 (b)(ii)-(iii)), enabling (2.8) to be satisfied for $\|f\|^{(1)}$ for a larger range of $\alpha$, if $z$ is small enough. It is also possible to begin with $\mathcal{N}(\alpha z, 1 + \alpha z^2/2)$, correcting for both mean and variance.

It is also possible to generalize the class of perturbed measures by replacing the term $\alpha\{g(x+z) - g(x)\}$ corresponding to Poisson jumps of rate $\alpha$ and magnitude $z$ by a more general Lévy process, taking instead $\int\{g(x+z) - g(x)\}\,\alpha(dz)$, for a suitable measure $\alpha$.

Example 3.3. As our third example, considered already in Barbour & Xia (1999) and in Barbour & Čekanavičius (2002), we consider (signed) compound Poisson distributions $\pi_1$ on $\mathbb{Z}$, the set of all integers, as perturbations of Poisson distributions on $\mathbb{Z}_+ := \{0, 1, 2, \ldots\}$. We begin with $\pi_1$ as the compound Poisson distribution $\mathrm{CP}(\lambda, \mu)$ on $\mathbb{Z}_+$, the distribution of $\sum_{l \ge 1} lN_l$, where $N_1, N_2, \ldots$ are independent, and $N_l \sim \mathrm{Po}(\lambda\mu_l)$; $m_1 := \sum_{l \ge 1} l\mu_l$ is assumed to be finite. In this case, we have $\mathcal{X} = \mathcal{X}_0 = \mathbb{Z}_+$. With $\mathcal{G}$ the space of bounded functions $g\colon \mathbb{N} \to \mathbb{R}$, endowed with the supremum norm, a suitable Stein operator for $\pi_1$ is given by

$$(\mathcal{A}_1 g)(j) = \lambda\sum_{l \ge 1} l\mu_l\,g(j+l) - jg(j), \qquad j \ge 0, \tag{3.5}$$

considered as a perturbation of the Stein operator

$$(\mathcal{A}_0 g)(j) = \lambda m_1 g(j+1) - jg(j), \qquad j \ge 0; \tag{3.6}$$

this means that

$$(\mathcal{U}g)(j) = \lambda\sum_{l \ge 1} l\mu_l\{g(j+l) - g(j+1)\}, \qquad j \ge 0. \tag{3.7}$$

Taking the supremum norm on $\mathcal{F}$, assumptions (2.1)-(2.4) and (2.6)-(2.7) are satisfied; and since, from the well-known properties of the solution of the Stein Poisson equation,

$$\|\Delta(\mathcal{A}_0^{-1}P_0 f)\|_\infty \le \frac{2}{\lambda m_1}\|f\|_\infty, \tag{3.8}$$

where $\Delta g(j) := g(j+1) - g(j)$, it follows that

$$\|\mathcal{U}\mathcal{A}_0^{-1}P_0\| \le 2\lambda\sum_{l \ge 1} l(l-1)\mu_l/(\lambda m_1) = 2m_2/m_1,$$

where $m_2 = \sum_{l \ge 1} l(l-1)\mu_l$. Hence (2.8) is satisfied if $m_2/m_1 < 1/2$, and Theorem 2.4 can then be invoked. Note that, in this setting, it is reasonable to work in terms of the supremum norm, since total variation approximation may genuinely be accurate.
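The Stein identity for the operator (3.5) and the contraction constant $2m_2/m_1$ can both be checked numerically. The sketch below is illustrative only: it assumes that $\mu$ has finite support and is passed as a dictionary `{l: mu_l}`, it computes the compound Poisson point probabilities by the standard recursion $n\,p(n) = \sum_l l\lambda\mu_l\,p(n-l)$, and the function names `cp_pmf`, `stein_residual` and `contraction_constant` are hypothetical.

```python
import math


def cp_pmf(lam, mu, n_max):
    """Point probabilities p(0..n_max) of sum_l l*N_l with N_l ~ Po(lam*mu[l])."""
    p = [0.0] * (n_max + 1)
    p[0] = math.exp(-lam * sum(mu.values()))
    for n in range(1, n_max + 1):
        # Recursion n*p(n) = sum_l l*lam*mu_l*p(n-l) for compound Poisson laws.
        p[n] = sum(l * lam * w * p[n - l] for l, w in mu.items() if l <= n) / n
    return p


def stein_residual(lam, mu, g, n_max):
    """Truncated value of E[(A1 g)(W)] with A1 as in (3.5); should be close to 0."""
    p = cp_pmf(lam, mu, n_max)
    return sum(
        p[j] * (lam * sum(l * w * g(j + l) for l, w in mu.items()) - j * g(j))
        for j in range(n_max + 1)
    )


def contraction_constant(mu):
    """The bound 2*m2/m1, with m1 = sum l*mu_l and m2 = sum l*(l-1)*mu_l."""
    m1 = sum(l * w for l, w in mu.items())
    m2 = sum(l * (l - 1) * w for l, w in mu.items())
    return 2.0 * m2 / m1


# Assumed toy parameters: jumps of size 1 with weight 0.8 and size 2 with weight 0.2.
lam, mu = 2.0, {1: 0.8, 2: 0.2}
gamma = contraction_constant(mu)
pmf = cp_pmf(lam, mu, 80)
residual = stein_residual(lam, mu, lambda j: 1.0 / (1.0 + j), 80)
```

With these toy parameters $m_2/m_1 = 1/3 < 1/2$, so the contraction condition holds, and the truncated expectation of $\mathcal{A}_1 g$ vanishes up to rounding error for the bounded test function used.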

There are nonetheless other distances that are useful. Two such are the Wasserstein distance $d_W$, defined for measures $P$ and $Q$ on $\mathbb{Z}$ by

$$d_W(P, Q) := \sup_{f \in \mathrm{Lip}_1}|P(f) - Q(f)|,$$

where $\mathrm{Lip}_1 := \{f\colon \mathbb{Z} \to \mathbb{R};\ \|\Delta f\|_\infty \le 1\}$, and the point metric $d_{pt}$ defined by

$$d_{pt}(P, Q) := \max_{j \in \mathbb{Z}}|P\{j\} - Q\{j\}|,$$

which has application when proving local limit theorems.
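For probability measures with finite support on the integers, the three distances can be computed directly. The sketch below is a minimal illustration under stated assumptions: measures are passed as dictionaries `{j: probability}`, the total variation distance follows the convention of the introduction (supremum over functions bounded by 1, which equals the sum of absolute differences of point probabilities), and the Wasserstein distance uses the standard identity with the sum of absolute differences of distribution functions. The function names are hypothetical.

```python
def _support(p, q):
    """All integers between the smallest and largest atoms of p and q."""
    lo = min(min(p), min(q))
    hi = max(max(p), max(q))
    return range(lo, hi + 1)


def d_tv(p, q):
    """sup over |h| <= 1 of |P(h) - Q(h)|, i.e. the sum of |P{j} - Q{j}|."""
    return sum(abs(p.get(j, 0.0) - q.get(j, 0.0)) for j in _support(p, q))


def d_wasserstein(p, q):
    """Sum over j of |P(-inf, j] - Q(-inf, j]| for probability measures on Z."""
    total, cum = 0.0, 0.0
    for j in _support(p, q):
        cum += p.get(j, 0.0) - q.get(j, 0.0)   # difference of distribution functions at j
        total += abs(cum)
    return total


def d_pt(p, q):
    """Point metric: max over j of |P{j} - Q{j}|."""
    return max(abs(p.get(j, 0.0) - q.get(j, 0.0)) for j in _support(p, q))


# Assumed toy measures for illustration.
P = {0: 0.5, 1: 0.5}
Q = {0: 0.25, 1: 0.5, 2: 0.25}
```

Note that `d_tv` as written is twice the value given by the other common convention, in which the supremum is taken over indicator functions of sets.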

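For Wasserstein distance, it is natural to begin with the semi-norm $\|f\| := \|f\|_W := \|\Delta f\|_\infty$ on $\mathcal{F}$, which becomes a norm when restricted to $\bar{\mathcal{F}}_0$. The arguments in Section 2 go through in this modified setting very much as before; the only practical differences are that one needs to check that $P_0\mathcal{U}\mathcal{A}_0^{-1}$ maps $\bar{\mathcal{F}}_0$ into itself, and to replace the condition (2.8) by

$$\gamma := \|P_0\mathcal{U}\mathcal{A}_0^{-1}\| < 1. \tag{3.9}$$

For the Poisson operator $\mathcal{A}_0$ given in (3.6), it is known that

$$\|g^0_f\|_\infty \le \|P_0 f\|_W = \|f\|_W; \qquad \|\Delta g^0_f\|_\infty \le 1.15(\lambda m_1)^{-1/2}\|f\|_W; \qquad \|\Delta^2 g^0_f\|_\infty \le 2(\lambda m_1)^{-1}\|f\|_W, \tag{3.10}$$

whenever $f \in \mathcal{F}$ and $g^0_f := \mathcal{A}_0^{-1}P_0 f$ [Barbour and Xia (2005)]. Hence, for $\mathcal{A}_1$ as in (3.5) and $f \in \bar{\mathcal{F}}_0$, it follows from (3.10) that

$$\|P_0\mathcal{U}g^0_f\|_W = \|\Delta P_0\mathcal{U}g^0_f\|_\infty \le \lambda\sum_{l \ge 1} l(l-1)\mu_l\,\|\Delta^2 g^0_f\|_\infty \le 2(m_2/m_1)\|f\|_W,$$

so that $P_0\mathcal{U}\mathcal{A}_0^{-1}$ indeed maps $\bar{\mathcal{F}}_0$ into itself, and $\gamma = \|P_0\mathcal{U}\mathcal{A}_0^{-1}\| \le 2m_2/m_1$. Thus (3.9) is satisfied for $m_2/m_1 < 1/2$, and the perturbation approach can then be invoked.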

For the point metric, we take the $\ell_1$-norm $\|f\| := \|f\|_1 := \sum_{j \in \mathbb{Z}}|f(j)|$ on $\mathcal{F}$. For $f \in \bar{\mathcal{F}}_0$ and $g^0_f = \mathcal{A}_0^{-1}P_0 f$, we have

$$\|g^0_f\|_\infty \le (\lambda m_1)^{-1}\|f\|_1; \qquad \|\Delta g^0_f\|_1 \le 2(\lambda m_1)^{-1}\|f\|_1; \tag{3.11}$$

both inequalities are consequences of the proof of the second inequality in Barbour, Holst & Janson (1992, Lemma 1.1.1). Hence, from (3.7), it follows immediately that

$$\|\mathcal{U}g^0_f\|_1 \le \lambda\sum_{l \ge 1} l\mu_l\sum_{j \ge 0}\sum_{s=1}^{l-1}|\Delta g^0_f(j+s)| \le 2\lambda m_2(\lambda m_1)^{-1}\|f\|_1 = 2(m_2/m_1)\|f\|_1,$$

so that condition (2.8) is once again satisfied if $m_2/m_1 < 1/2$.

If, more generally, $\pi_1$ is a (signed) compound measure on $\mathbb{Z}$, with characteristic function

$$\exp\Big\{\lambda\sum_{l \in \mathbb{Z}}\mu_l(e^{il\theta} - 1)\Big\},$$

similar considerations can be applied. Here, we now have $\mathcal{X} = \mathbb{Z}$, but $\mathcal{X}_0$ is still $\mathbb{Z}_+$. The corresponding Stein operator is formally exactly as in (3.5), except that the $l$-sum now runs over the whole of $\mathbb{Z}$, and we require $m_1$ to be positive; also, the role of $m_2$ is now played by $m'_2 = \sum_{l \in \mathbb{Z}} l(l-1)|\mu_l|$. When applying Lemma 2.3 and Theorem 2.4, we have the inequalities

$$\kappa(\pi, \mathbb{Z}_-) \le 2|\pi|(\mathbb{Z}_-)$$

for use with $d_{TV}$;

$$\kappa(\pi, \mathbb{Z}_-) \le \sum_{j < 0}|\pi|\{j\}(|j| + \lambda)$$

for $d_W$, and, with the fact that $\max_j \pi_0(j) \le (2e\lambda)^{-1/2}$ [Barbour, Holst & Janson (1992, p. 262)],

$$\kappa(\pi, \mathbb{Z}_-) \le \frac{1}{\sqrt{2e\lambda}}\,|\pi|(\mathbb{Z}_-) + \max_{j < 0}|\pi|\{j\}$$

for $d_{pt}$.

Example 3.4. In this example, the setting is similar to that in the preceding example, but we now consider a compound Poisson distribution $\pi_1 = \mathrm{CP}(\lambda^1, \mu^1)$ on $\mathbb{Z}_+$ as a perturbation not of a Poisson distribution, but of another compound Poisson distribution $\pi_0 = \mathrm{CP}(\lambda^0, \mu^0)$ on $\mathbb{Z}_+$. The reason for doing so is that the solutions to the Stein equation are known to be well behaved only for rather restricted classes of compound Poisson distributions: see Barbour & Utev (1998), Barbour & Xia (2000). The perturbation method offers the possibility of expanding the class of those with good behaviour by including neighbourhoods not only of the Poisson distributions, but also of any other compound Poisson distributions whose Stein solutions can be controlled. In particular, we shall suppose that the distribution $\pi_0 = \mathrm{CP}(\lambda^0, \mu^0)$ is such that

$$j\mu^0_j \ge (j+1)\mu^0_{j+1}, \qquad j \ge 1,$$

and that $\delta := \mu^0_1 - 2\mu^0_2 > 0$, these conditions implying that, with $c_1(\lambda^0) = 4 - 2(\delta\lambda^0)^{-1/2}$ and $c_2(\lambda^0) = \tfrac{1}{2}(\delta\lambda^0)^{-1} + 2\log^+(2(\delta\lambda^0))$,

$$\|g^0_f\|_\infty \le \{\delta\lambda^0\}^{-1/2}c_1(\lambda^0)\|f\|_\infty \quad \text{and} \quad \|\Delta g^0_f\|_\infty \le \{\delta\lambda^0\}^{-1}c_2(\lambda^0)\|f\|_\infty, \tag{3.12}$$

where, as usual, $g^0_f := \mathcal{A}_0^{-1}P_0 f$; see Barbour, Chen & Loh (1992, pp. 1854-5). Here, the Stein operators $\mathcal{A}_0$ and $\mathcal{A}_1$ are given as in (3.5), with the corresponding choices of $\lambda$ and $\mu$, giving

$$(\mathcal{U}g)(j) = \sum_{l \ge 1} l\{\lambda^1\mu^1_l - \lambda^0\mu^0_l\}g(j+l), \qquad j \ge 0.$$

As in the previous example, we shall only consider perturbations which preserve the mean, so that also

$$\lambda^1\sum_{l \ge 1} l\mu^1_l = \lambda^0\sum_{l \ge 1} l\mu^0_l.$$

Taking the supremum norm on $\mathcal{F}$, assumptions (2.1)-(2.4) and (2.6)-(2.7) are satisfied. In order to express the contraction condition (2.8), write

$$E := \tfrac{1}{2}\sum_{l \ge 1} l\,|\lambda^1\mu^1_l - \lambda^0\mu^0_l|,$$

and define probability measures $\rho$ and $\sigma$ on $\mathbb{N}$ by

$$\rho\{l\} := E^{-1}\,l\,(\lambda^1\mu^1_l - \lambda^0\mu^0_l)^+; \qquad \sigma\{l\} := E^{-1}\,l\,(\lambda^1\mu^1_l - \lambda^0\mu^0_l)^-;$$

set $\theta := E\,d_W(\rho, \sigma)$, where $d_W$ denotes the Wasserstein distance. Then, using (3.12), it follows easily that

$$\|\mathcal{U}\mathcal{A}_0^{-1}P_0\|_\infty \le \{\delta\lambda^0\}^{-1}c_2(\lambda^0)\,\theta =: \gamma,$$

with (2.8) satisfied if $\gamma < 1$.



Example 3.5. In our last example, we consider solving the Stein equation
for a point process, whose distribution $\pi_1$ is close to that of a spatial Poisson
process. Let $X$ be a compact metric space, and let $\mathcal{X}$ denote the space of
Radon measures (point configurations) on $X$. Then a Poisson process on $X$
with intensity measure $\lambda$ satisfying $\lambda := \lambda(X) < \infty$ is a random element
$\sum_{i=1}^{N} \delta_{X_i}$ of $\mathcal{X}$, where $N, X_1, X_2, \ldots$ are all independent, $N \sim \mathrm{Po}(\lambda)$,
$X_i \sim \lambda^{-1}\lambda$ for $i \ge 1$, and $\delta_x$ denotes the unit mass at $x$. Its distribution $\pi_0$
can be characterized by the fact that $\pi_0(\mathcal{A}_0 g) = 0$ for all $g$ in

$$\mathcal{G} := \{g\colon \mathcal{X} \to \mathbb{R};\ g(\emptyset) = 0,\ \|\Delta g\|_\infty < \infty\},$$

where

$$(\mathcal{A}_0 g)(\xi) := \int_X \{(g(\xi + \delta_x) - g(\xi))\,\lambda(dx) + (g(\xi - \delta_x) - g(\xi))\,\xi(dx)\},$$

and $\|\Delta g\|_\infty := \sup_{\xi \in \mathcal{X},\, x \in X} |g(\xi + \delta_x) - g(\xi)|$. Note that the Stein operator $\mathcal{A}_0$
is the generator of a spatial immigration-death process, with $\pi_0$ as its
equilibrium distribution. For the measure $\pi_1$, we take the equilibrium distribution
of another spatial immigration-death process on $X$, with generator $\mathcal{A}_1$ given
by

$$(\mathcal{A}_1 g)(\xi) := \int_X \{(g(\xi + \delta_x) - g(\xi))\,\lambda(\xi, dx) + (g(\xi - \delta_x) - g(\xi))\,\xi(dx)\};$$

here, the immigration measure is allowed to depend on the current
configuration $\xi$. We can write $\mathcal{A}_1 = \mathcal{A}_0 + \mathcal{U}$ if we set

$$(\mathcal{U} g)(\xi) := \int_X (g(\xi + \delta_x) - g(\xi))\,(\lambda(\xi, dx) - \lambda(dx)),$$

and we note that $\mathcal{X}_0 := \mathrm{supp}(\pi_0) = \mathcal{X}$.

We begin by considering perturbations appropriate for total variation
approximation, taking the set $\mathcal{F}$ of functions $f\colon \mathcal{X} \to \mathbb{R}$ with the supremum
norm $\|\cdot\|_\infty$. Then, as in Barbour & Brown (1992, pp. 12-13), it is possible
to define a right inverse Aq^ satisfying ([23]) and ([231), with A = 2. To check 
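that $\mathcal{U}\mathcal{A}_0^{-1} P_0\colon \mathcal{F} \to \mathcal{F}$, we combine the definition of $\mathcal{U}$ and (2.4) to give

$$|(\mathcal{U} g_f)(\xi)| \le 2 L_\xi(X)\,\|f\|_\infty,$$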






where $L_\xi(\cdot)$ denotes the absolute difference between the measures $\lambda(\xi, \cdot)$
and $\lambda(\cdot)$; hence we shall need in addition to assume that

$$\Lambda := 2\sup_{\xi} L_\xi(X) < \infty, \tag{3.13}$$

in order to make progress. If we do, then (2.8) is satisfied with $\gamma = \Lambda$ if
$\Lambda < 1$, and Theorem 2.4 can be used to show that Stein's method works.

Total variation is often too strong a metric for comparing point process
distributions, and so an alternative metric $d_2$ is proposed in Barbour
& Brown (1992), based on test functions Lipschitz with respect to a metric
on $\mathcal{X}$ which is bounded by 1. Similar calculations can be carried out in this
setting also; the condition needed to satisfy (2.8) is somewhat more stringent.

Even the contraction condition A < 1 is rather restrictive. Consider 
a hard-core model, in which X C M"^ has volume A{dx) = dx, and 
A{^,dx) = I[^{B{x,e)) = 0]dx, where I[C] denotes the indicator of the 
event C; this specification of A(^, dx) is such that no immigration is allowed 
within distance e of a point of the current configuration ^. Then X = 'd, and 
contraction is only achieved if the expected number 'd of points under ttq is 
less than 1. However, one could also consider a model with tti the equilibrium 
distribution of a slightly different immigration death process, in which 

A{^,dx) = max{I[^{B{x,e)) = 0],I[^{'X)>2^]}dx; 

for large the difference between the equilibrium distributions of the two 
processes is small, but, for the new process, A < 2i)a{e), where a{e) is the 
area of the e-ball, meaning that models with much larger expected numbers 
of points can still satisfy the contraction condition. Nonetheless, these are 
still models in which, at distance e, little interaction can be expected; the 
mean number of pairs of points closer than e to one another under ttq is 
about ^i)a{e), and, if the contraction condition is satisfied, this has to be less 
than |. 

4 Illustrations 

In this section, we illustrate how the perturbations described above can be
used in specific examples.

Example 4.1. In our first illustration, we return to the setting and notation
of Example 3.2, and consider approximation by the distribution $\pi_1$ whose
Stein operator is given in (3.3) above. The distribution we wish to approximate
is the equilibrium distribution $\pi$ of another jump-diffusion process, in
which the jumps do not have fixed size $z$, but are randomly chosen with $z$ as
mean; this process has generator $\mathcal{A}$ given by

$$(\mathcal{A}g)(x) = g''(x) - x g'(x) + a \int \{g(x + \zeta) - g(x)\}\,\mu(d\zeta), \qquad g \in \mathcal{G}. \tag{4.1}$$

This distribution can be expected to be close to $\pi_1$ provided that the
probability distribution $\mu$ is concentrated about $z$, and since the distribution $\pi_1$ is
reasonably well understood, such an approximation may constitute a useful
simplification.

The main step is thus to establish a bound of the form (2.18), after which
Theorem 2.4 can be applied. However, for $X \sim \pi$ and $g_f := \mathcal{A}_0^{-1} P_0 f$, we
immediately have

$$\pi(\mathcal{A}_1 g_f) = \pi(\mathcal{A} g_f) - a\,E\!\int \{g_f(X + \zeta) - g_f(X + z)\}\,\mu(d\zeta) = -a\,E\!\int \{g_f(X + \zeta) - g_f(X + z)\}\,\mu(d\zeta),$$

since $\pi(\mathcal{A}g) = 0$ for all $g \in \mathcal{G}$. Because $\mu$ has mean $z$, the first-order term
in the expansion of $g_f(X + \zeta)$ about $X + z$ vanishes, and from this it follows by
the mean value theorem that

$$|\pi(\mathcal{A}_1 g_f)| \le \tfrac12 a \int (\zeta - z)^2\,\mu(d\zeta)\,\|g_f''\|_\infty \le 2a \int (\zeta - z)^2\,\mu(d\zeta)\,\|f\|_\infty.$$

This suggests that the supremum norm on $\mathcal{F}$ is an appropriate choice, and
from Theorem 2.4, if $\gamma := \sqrt{2\pi}\,za < 1$, as in (3.4), it follows that

$$d_{TV}(\pi, \pi_1) \le 2a \int (\zeta - z)^2\,\mu(d\zeta) \,/\, (1 - \gamma).$$

Thus the total variation distance between the two distributions is small if
the variance of $\mu$ is small (and $\gamma < 1$).
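
To make the size of this bound concrete, the following minimal sketch evaluates it for given parameters. It assumes the reading $\gamma = \sqrt{2\pi}\,za$ and the bound $2a\,\mathrm{Var}(\mu)/(1-\gamma)$ from the display above; the function name `tv_bound` and the choice of a uniform jump distribution are hypothetical, introduced only for illustration.

```python
import math

def tv_bound(a: float, z: float, var_mu: float) -> float:
    """Evaluate 2 a Var(mu) / (1 - gamma), with gamma = sqrt(2 pi) z a (assumed reading)."""
    gamma = math.sqrt(2.0 * math.pi) * z * a
    if gamma >= 1.0:
        raise ValueError("contraction condition gamma < 1 fails")
    return 2.0 * a * var_mu / (1.0 - gamma)

# Illustration: jump rate a = 0.1, mean jump z = 1, and mu uniform on
# [z - h, z + h] with h = 0.3, whose variance is h^2 / 3.
h = 0.3
print(tv_bound(a=0.1, z=1.0, var_mu=h * h / 3.0))
```

The bound degrades both as the jump distribution spreads out and as $\gamma$ approaches 1, where the perturbation argument stops being a contraction.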

Example 4.2. We continue with the setting and notation of Example 3.2,
and again approximate by the distribution $\pi_1$. Here, as the measure $\pi$, we
take the equilibrium distribution of a Markov jump process $W_N$, defined as
follows. We let $X_N$ be the pure jump Markov process on $\mathbb{Z}_+$ with transition
rates given by

    j → j + 1 at rate N;   j → j − 1 at rate j;
    j → j + ⌊z√N⌋ at rate a,

and we then set $W_N(t) := \{X_N(t) - N\}/\sqrt{N}$. If $z = 0$, the equilibrium
distribution of $X_N$ is the Poisson distribution with mean $N$, and that of $W_N$
the centred and normalized Poisson distribution, which is itself, for large $N$,
close to the normal in Kolmogorov distance, but not in total variation. Here,
we wish to find bounds for the accuracy of approximation by $\pi_1$ when $z > 0$.
As above, we need a bound of the form (2.18), so as to be able to apply
Theorem 2.4.
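
The equilibrium law of $X_N$ can be approximated numerically, which gives a quick sanity check on the process just defined. The sketch below is an illustration only: the truncation level `M`, the uniformization scheme and the function name `equilibrium` are assumptions of this sketch, not part of the paper. Applying the generator to $g(j) = j$ suggests that the equilibrium mean of $X_N$ is $N + a\lfloor z\sqrt{N}\rfloor$, and the computed mean can be compared with that value.

```python
import math

def equilibrium(N: int, z: float, a: float, t_end: float = 30.0) -> list:
    """Approximate the equilibrium law of X_N by iterating the uniformized chain."""
    jump = math.floor(z * math.sqrt(N))   # size of the large jump
    M = 4 * N + 50                        # truncation level (assumption of this sketch)
    R = N + M + a                         # uniformization rate, at least every exit rate
    p = [0.0] * (M + 1)
    p[N] = 1.0                            # start from the point mass at N
    for _ in range(int(t_end * R)):
        q = [0.0] * (M + 1)
        for j, pj in enumerate(p):
            if pj == 0.0:
                continue
            stay = pj
            if j < M:                       # j -> j + 1 at rate N
                up = pj * N / R
                q[j + 1] += up
                stay -= up
            if j > 0:                       # j -> j - 1 at rate j
                down = pj * j / R
                q[j - 1] += down
                stay -= down
            if jump > 0 and j + jump <= M:  # j -> j + floor(z sqrt(N)) at rate a
                big = pj * a / R
                q[j + jump] += big
                stay -= big
            q[j] += stay
        p = q
    return p

N, z, a = 16, 1.0, 0.5
dist = equilibrium(N, z, a)
mean_x = sum(j * pj for j, pj in enumerate(dist))
second_moment_w = sum(pj * (j - N) ** 2 / N for j, pj in enumerate(dist))
print("E X_N   :", mean_x, " predicted:", N + a * math.floor(z * math.sqrt(N)))
print("E W_N^2 :", second_moment_w)
```

Uniformization is used here only because it needs no linear algebra library; solving the truncated balance equations directly would serve equally well.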

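Much as above, we begin by observing that $\pi(\mathcal{A}g) = 0$ for all $g \in \mathcal{G}$,
where now, writing $w_{jN} := (j - N)/\sqrt{N}$ and $\eta_N := 1/\sqrt{N}$, we have

$$(\mathcal{A}g)(w_{jN}) = N\{g(w_{jN} + \eta_N) - g(w_{jN})\} + j\{g(w_{jN} - \eta_N) - g(w_{jN})\} + a\{g(w_{jN} + \lfloor z\sqrt{N}\rfloor \eta_N) - g(w_{jN})\}.$$

Subtracting $(\mathcal{A}_1 g)(w_{jN})$ and using Taylor's expansion, it follows that

$$|(\mathcal{A}g)(w_{jN}) - (\mathcal{A}_1 g)(w_{jN})| \le N^{-1/2}\left(\tfrac13\|g'''\|_\infty + \tfrac12 |w_{jN}|\,\|g''\|_\infty + a\|g'\|_\infty\right),$$

so that

$$|\pi(\mathcal{A}_1 g)| \le N^{-1/2}\left(\tfrac13\|g'''\|_\infty + \tfrac12 E|W_N|\,\|g''\|_\infty + a\|g'\|_\infty\right). \tag{4.2}$$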

Note that, taking $g(w) = w$ and $g(w) = w^2$ respectively in $\pi(\mathcal{A}g) = 0$, as we
may, by Hamza & Klebaner (1995, Theorem 2), it follows that $|E W_N| \le az$
and

$$2E\{W_N^2\} \le 2N\eta_N^2 + \eta_N |E W_N| + 2az\,|E W_N| + az^2 \le 2 + az\eta_N + 2a^2z^2 + az^2,$$

which implies that

$$\{E|W_N|\}^2 \le E\{W_N^2\} \le 1 + \tfrac12 az\eta_N + a^2z^2 + \tfrac12 az^2;$$

thus $E|W_N|$ is uniformly bounded in $N$. Furthermore, for $g = g_f := \mathcal{A}_0^{-1} P_0 f$
and $f \in \mathcal{F}^{(1)}$, we can control the first three derivatives of $g_f$ by using
Proposition 5.1 with $y = (g_f)'$, so that (4.2) yields a bound of the form

$$|\pi(\mathcal{A}_1 g_f)| \le C N^{-1/2}\,\|f\|_{(1)},$$

for all $f \in \mathcal{F}^{(1)}$. In view of Theorem 2.4, this translates into the bound

$$d_W(\pi, \pi_1) \le C N^{-1/2}/(1 - \gamma)$$

if $\gamma < 1$, where now, for $\|\cdot\|_{(1)}$, we have $\gamma = (4 + \sqrt{2\pi})\,za$, as in Example 3.2.

If, instead, Kolmogorov distance is of interest, then the only obstacle is
to verify (2.25) of Theorem 2.5. For $g = g_h$, the estimate given in (4.2) is
fine, except for the first term: it is no longer possible to bound the difference

$$D_N(w) := N\{g(w + \eta_N) - g(w) + g(w - \eta_N) - g(w)\} - g''(w)$$

by $\tfrac13\eta_N\|g'''\|_\infty$, since, for $h = h_a = 1_{(-\infty, a]}$, $g'''(a)$ is not defined. However, it
is clear that $|D_N(w)| \le 2\|g''\|_\infty$ for all $w$, and that, for $|w - a| > \eta_N$,

$$|D_N(w)| \le \tfrac13\eta_N \sup_{|x - w| < \eta_N} |g'''(x)|.$$

Now, for $h = h_a$, taking $a > 0$ without real loss of generality, we have

$$|g'''(x)| \le C_1 + C_2\, a\, e^{-a(a - x)}\, 1_{(0,a)}(x), \qquad x \ne a, \tag{4.3}$$

for universal constants $C_1$ and $C_2$, so that $g'''$ is well behaved except just
below $a$. The bound (4.3) can then be combined with the concentration
inequality

$$P[W_N \in [a, b]] \le \{\tfrac12(b - a) + \eta_N\}\,(E|W_N| + az),$$

obtained by taking $g'' = 1_{[a - \eta_N,\, b + \eta_N]}$ and $g'(w) = \int_{(a+b)/2}^{w} g''(t)\,dt$ for any
$a < b$ in $\pi(\mathcal{A}g) = 0$, to deduce a bound $E|D_N(W_N)| \le C N^{-1/2}$, and hence
Kolmogorov approximation also at rate $N^{-1/2}$. Total variation approximation
is of course never good, since $\mathcal{L}(W_N)$ gives probability 1 to a discrete
lattice, and $\pi_1$ is absolutely continuous with respect to Lebesgue measure.

Example 4.3. (Borovkov-Pfeifer approximation) Borovkov & Pfeifer (1996)
suggested using a single n-independent infinite convolution of simple signed
measures as a correction to the Poisson approximation to the distribution
of a sum of independent indicator random variables. Their approximation 
is particularly effective in the case that they treated, the number of records 
in n i.i.d. trials. Here, the approximation is not as complicated as it might 
seem, because the generating function of the correcting measure can be con- 
veniently expressed in terms of gamma functions. Its accuracy is then of 
order $O(n^{-2})$, which is far better than the $O(1/\log n)$ error in the standard
Poisson approximation. Their approach was extended to the multivariate
case of independent summands in Čekanavičius (2002) and Roos (2003).
Note also that Roos (2003) obtained asymptotically sharp constants in the 
univariate case. In this example, by treating their approximating measure as
a perturbation of the Poisson, as in Example 3.3, we investigate
Borovkov-Pfeifer approximation to the distribution of the sum of dependent Bernoulli
random variables.

Let $I_i$, $i \ge 1$, be dependent Bernoulli Be$(p_i)$ random variables. Define
$W = \sum_{i=1}^{n} I_i$, $W^{(i)} = W - I_i$, and let $\widetilde{W}^{(i)}$ be a random variable having
the conditional distribution of $W^{(i)}$ given $I_i = 1$; that is, for all $k \in \mathbb{Z}_+$,
$P(\widetilde{W}^{(i)} = k) = P(W^{(i)} = k \mid I_i = 1)$. Let

i=l 1=1 ^ 

The Borovkov-Pfeifer approximation is defined to be the convolution of the
Poisson distribution Po$(\lambda)$ and the signed measure BP determined by its
generating function:

$$\widehat{\mathrm{BP}}(z) = \prod_{i=1}^{\infty} \{(1 + p_i(z - 1))\exp\{-p_i(z - 1)\}\}. \tag{4.4}$$

Using the fact that

$$e^{-p(z-1)}(1 + p(z - 1)) = \exp\{\ln(1 + pz/q) - \ln(1 + p/q) - p(z - 1)\}
 = \exp\Big\{\frac{p^2}{q}(z - 1) + \sum_{l \ge 2} \frac{(-1)^{l+1}}{l}\Big(\frac{p}{q}\Big)^{l}(z^l - 1)\Big\}, \tag{4.5}$$

where $q = 1 - p$, one can see that BP is a signed compound Poisson measure,
provided that $\sum_{i \ge 1} p_i^2 < \infty$. Note that $\sum_{i \ge 1} p_i = \infty$ is allowed, as is indeed
the case for record values, when $p_i = 1/i$.
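
The series form of identity (4.5) is easy to check numerically. The sketch below is illustrative only and assumes the reading of (4.5) in which the exponent is $(p^2/q)(z-1) + \sum_{l \ge 2} \frac{(-1)^{l+1}}{l}(p/q)^l (z^l - 1)$; the function names and the truncation at 200 terms are choices made here. The logarithmic series converges for $|z| \le 1$ when $p/q < 1$, which holds in particular for $p \le 1/3$.

```python
import math

def lhs(p: float, z: float) -> float:
    """Left-hand side of (4.5): exp(-p(z-1)) * (1 + p(z-1))."""
    return math.exp(-p * (z - 1.0)) * (1.0 + p * (z - 1.0))

def rhs(p: float, z: float, terms: int = 200) -> float:
    """Signed compound Poisson form of (4.5), with the series truncated at `terms`."""
    q = 1.0 - p
    ratio = p / q
    exponent = (p * p / q) * (z - 1.0)
    for l in range(2, terms + 1):
        exponent += ((-1.0) ** (l + 1) / l) * ratio ** l * (z ** l - 1.0)
    return math.exp(exponent)

for p, z in [(0.2, 0.5), (1.0 / 3.0, -1.0), (0.1, 1.0), (0.25, 0.0)]:
    print(f"p={p:.4f} z={z:+.1f}  lhs={lhs(p, z):.12f}  rhs={rhs(p, z):.12f}")
```

The coefficients of $z^l - 1$ alternate in sign for $l \ge 2$, which is why BP is a signed rather than a genuine compound Poisson measure.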

Theorem 4.1 Assume that $p_i \le 1/3$, $i \ge 1$, that $\sum_{i \ge 1} p_i^2 < \infty$ and that

^ _ i^k ^ Ekiiii^M:! < 1. (4.6) 

mi A 2 ^ ^ 

Then

d„,(£(»y).Po(A).BP) < + (4.7) 
cip,(£(H'),Po(A)*BP) 

^ A(r3^(T'(«'^^£.(r^ + "')- 

lip; / oo 2 \ 

a.(£W.Po,A,*BP) < (4.9) 



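Remark. Let $I_i$, $i \ge 1$, be independent. Then it suffices to prove the
corresponding approximation for the sum $W_s := \sum_{i=s}^{n} I_i$ only. Indeed, let BP$_s$ be
specified by the generating function:

$$\widehat{\mathrm{BP}}_s(z) = \prod_{i=s}^{\infty} \{(1 + p_i(z - 1))\exp\{-p_i(z - 1)\}\}.$$

Then

$$\mathrm{Po}(\lambda) * \mathrm{BP} = \mathcal{L}\Big(\sum_{i=1}^{s-1} I_i\Big) * \mathrm{Po}\Big(\sum_{i=s}^{n} p_i\Big) * \mathrm{BP}_s$$

and

$$\mathcal{L}(W) = \mathcal{L}\Big(\sum_{i=1}^{s-1} I_i\Big) * \mathcal{L}(W_s),$$

and so, by the properties of total variation we have

$$d_{TV}(\mathcal{L}(W), \mathrm{Po}(\lambda) * \mathrm{BP}) \le d_{TV}\Big(\mathcal{L}(W_s), \mathrm{Po}\Big(\sum_{i=s}^{n} p_i\Big) * \mathrm{BP}_s\Big).$$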
If $W$ is the sum of independent Bernoulli variables, then $\eta_1 = 0$ and

$$\sup_k P(W = k) \le \frac12\Big\{\sum_{i=1}^{n} p_i(1 - p_i)\Big\}^{-1/2};$$
see Barbour & Jensen (1989, Lemma 1). Now, if we consider the records
example of Borovkov & Pfeifer (1996), with $p_i = 1/i$, we can take any $s \ge 4$
in the remark above, and obtain orders of accuracy for the total variation
distance, point metric and Wasserstein metric of $O((n \ln n)^{-1})$, $O(n^{-1}(\ln n)^{-3/2})$
and $O(n^{-1}(\ln n)^{-1/2})$, respectively.
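
For orientation, the slow decay of the uncorrected Poisson approximation in the records example can be observed directly. The sketch below is an illustration and not the Borovkov-Pfeifer correction itself: it uses the classical fact that the record indicators in $n$ i.i.d. continuous trials are independent Be$(1/i)$ variables, computes the exact law of their sum by convolution, and compares it in total variation with Po$(\lambda)$, $\lambda = \sum_{i \le n} 1/i$. All function names are hypothetical.

```python
import math

def records_pmf(n: int) -> list:
    """Exact law of the number of records: a sum of independent Be(1/i), i = 1..n."""
    pmf = [1.0]
    for i in range(1, n + 1):
        p = 1.0 / i
        new = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new[k] += mass * (1.0 - p)   # trial i is not a record
            new[k + 1] += mass * p       # trial i is a record
        pmf = new
    return pmf

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson(lam) probability of k, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def tv_to_poisson(n: int):
    """Total variation distance between the records law and Po(lambda), and lambda."""
    pmf = records_pmf(n)
    lam = sum(1.0 / i for i in range(1, n + 1))
    po = [poisson_pmf(k, lam) for k in range(n + 1)]
    tail = max(0.0, 1.0 - sum(po))       # Poisson mass beyond n, where the records law has none
    tv = 0.5 * (sum(abs(x - y) for x, y in zip(pmf, po)) + tail)
    return tv, lam

for n in (10, 100, 1000):
    tv, lam = tv_to_poisson(n)
    print(f"n={n:5d}  lambda={lam:.4f}  d_TV={tv:.6f}")
```

Since $p_1 = 1$, the first trial is always a record, so the exact law puts no mass at 0; this is one visible source of the discrepancy that the correcting measure is designed to absorb.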

Proof of Theorem 4.1. In this case, $\mathcal{X} = \mathcal{X}_0 = \mathbb{Z}_+$. Using (4.5), and
setting $q_i = 1 - p_i$, we can write Po$(\lambda) * \mathrm{BP}$ as the signed compound Poisson
measure with generating function




(4.10) 



where $\lambda_l = \lambda_{1l} + \lambda_{2l}$, with




/ > 2. 



Here, the components $\lambda_{1l}$ come from the signed compound Poisson
representation of a sum of independent Bernoulli Be$(p_i)$ random variables, $1 \le i \le n$,
and the $\lambda_{2l}$ from the remaining BP$_{n+1}$ measure.

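Let $\mu_l = \lambda_l/\lambda$. Then, since $\sum_{l \ge 1} l\lambda_{1l} = \sum_{i=1}^{n} p_i = \lambda$ and $\sum_{l \ge 1} l\lambda_{2l} = 0$,
we have $m_1 = \sum_{l \ge 1} l\mu_l = 1$. Hence, the formula for $\theta_1$ follows directly from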

00 00 00 / \ I 00 

1=2 1=2 i=l i=l 

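Next, we take Stein operators $\mathcal{A}_0$ as in (3.6) and $\mathcal{A}_1$ as in (3.5). For
$g = g_f := \mathcal{A}_0^{-1} P_0 f$, it follows that

$$E(\mathcal{A}_1 g)(W) = \Big\{\sum_{l \ge 1} l\lambda_{1l}\, E g(W + l) - E\{W g(W)\}\Big\} + \sum_{l \ge 1} l\lambda_{2l}\, E g(W + l). \tag{4.11}$$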






We begin by bounding the quantity in braces, which gives a bound for the
accuracy of the approximation of $\mathcal{L}(W)$ by the distribution of a sum of
independent Bernoulli Be$(p_i)$ random variables. We observe immediately that,
for any $i$ and $l$,

$$E g(W + l) = p_i\, E g(\widetilde{W}^{(i)} + l + 1) + q_i\, E\{g(W^{(i)} + l) \mid I_i = 0\}$$

and that

$$E g(W^{(i)} + l) = p_i\, E g(\widetilde{W}^{(i)} + l) + q_i\, E\{g(W^{(i)} + l) \mid I_i = 0\},$$

from which it follows that 

Eg{W + 1) = g,Er?(iy» + /) + p,Eg{W^'^ + / + !)+ p,uu, 

where we write := Eg{W^'^+l)-Eg{W^'^+l). Setting := 

so that IXii = Y17=i '^ih and observing that piVu = —qiV^-^-i, we thus have 

Y,^^i^9iW + 1) - E{Iig{W)} 
i>i 

= q^Y, vuEgiW^"^ +l)-q^Y. ^«^%(^^'^ + 

l>l l>2 

+ p,J2^u^ii-Pi^9(W^'^ + i) 
i>i 

i>i 

Adding over 1 < i < n, we thus find that 

oo 

Y^l^ii^giW + 1) -E{Wg{W)} 
1=1 

n 

i=l 1>1 

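It now remains to estimate the remaining element $\sum_{l \ge 1} l\lambda_{2l}\, E g(W + l)$
in (4.11). Using the identity

$$g(W + l) = g(W + 1) + \sum_{s=1}^{l-1} \Delta g(W + s),$$

we have

$$\sum_{l \ge 1} l\lambda_{2l}\, E g(W + l) = \Big\{\sum_{l \ge 1} l\lambda_{2l}\Big\} E g(W + 1) + \sum_{l \ge 1} l\lambda_{2l} \sum_{s=1}^{l-1} E\{\Delta g(W + s)\} = \sum_{l \ge 1} l\lambda_{2l} \sum_{s=1}^{l-1} E\{\Delta g(W + s)\},$$

because $\sum_{l \ge 1} l\lambda_{2l} = 0$. Now we have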

$$|E\{\Delta g(W + s)\}| \le \min\{\|\Delta g\|_\infty,\ \|\Delta g\|_1 \max_k P(W = k)\}$$



pi. 



1=2 i=n+l ^ 



\2- 



The estimates (4.7)-(4.9) thus follow directly from the inequalities (3.8),
(3.10) and (3.11) in Example 3.3. □

5 Appendix 

Here, we collect various properties of the solution $y$ to the equation

$$y'(x) - (1 - \psi)\,x\,y(x) = h(x) - h_\psi, \qquad x \in \mathbb{R}, \tag{5.1}$$

for given $h$ and $0 \le \psi < 1$, where $h_\psi = E h(N)$, for $N \sim N(0, (1 - \psi)^{-1})$.

Proposition 5.1
(a) If $h = 1_{(-\infty, z]}$ for any $z \in \mathbb{R}$, then

    (i) $\|y\|_\infty \le \dfrac14\sqrt{\dfrac{2\pi}{1 - \psi}}$;

    (ii) $\|y'\|_\infty \le 1$;

    (iii) $\sup_x |x\,y(x)| \le \dfrac{1}{1 - \psi}$.

(b) If $h$ is bounded, then

    (i) $\|y\|_\infty \le \sqrt{\dfrac{2\pi}{1 - \psi}}\,\|h\|_\infty$;

    (ii) $\|y'\|_\infty \le 4\|h\|_\infty$;

    (iii) $\sup_x |x\,y(x)| \le \dfrac{2}{1 - \psi}\,\|h\|_\infty$.

(c) If $h$ is uniformly Lipschitz, then

    (i) $\|y\|_\infty \le \dfrac{2}{1 - \psi}\,\|h'\|_\infty$;

    (ii) $\|y'\|_\infty \le \dfrac{4}{\sqrt{1 - \psi}}\,\|h'\|_\infty$;

    (iii) $\|y''\|_\infty \le 2\|h'\|_\infty$;

    (iv) $\sup_x |x\,y'(x)| \le \dfrac{3}{1 - \psi}\,\|h'\|_\infty$.

Proof. Equation (5.1) can be transformed, using the substitution $x =
w/\sqrt{1 - \psi}$, into the equation with $\psi = 0$ for the standard normal
distribution, for which the corresponding bounds are mostly given in Chen &
Shao (2005, Lemmas 2.2 and 2.3). In particular, the bounds (a)(i)-(iii) follow
directly from their Equations (2.9), (2.8) and (2.7), respectively; the
bounds (b)(i)-(ii) from the proofs of their Equations (2.11) and (2.12); and
the bounds (c)(i)-(iii) from their Equations (2.11)-(2.13).

The bound (b)(iii) is easily deduced from the explicit expression for the
solution $y$: for instance, for $x > 0$, we have

$$x\,y(x) = -x\,e^{(1-\psi)x^2/2} \int_x^{\infty} e^{-(1-\psi)t^2/2}\,(h(t) - h_\psi)\,dt,$$

immediately giving

$$|x\,y(x)| \le x \int_0^{\infty} e^{-(1-\psi)xz}\,|h(x + z) - h_\psi|\,dz,$$

from which (b)(iii) follows.
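
The bounds in part (a) can be checked numerically from the explicit solution. The sketch below is illustrative only: it assumes the readings $\|y\|_\infty \le \frac14\sqrt{2\pi/(1-\psi)}$ and $\sup_x |x\,y(x)| \le 1/(1-\psi)$ for $h = 1_{(-\infty, z]}$, uses the closed form of $y$ obtained by rescaling the standard normal Stein solution, and evaluates it on a finite grid; the function names and grid are choices made here.

```python
import math

def Phi(u: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-u / math.sqrt(2.0))

def upper_tail(u: float) -> float:
    """1 - Phi(u), computed via erfc so that large u stays accurate."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))

def stein_solution(x: float, z: float, psi: float) -> float:
    """Solution y of (5.1) for h = 1_{(-inf, z]} (closed form by rescaling)."""
    c = 1.0 - psi
    s = math.sqrt(c)
    scale = math.sqrt(2.0 * math.pi / c) * math.exp(c * x * x / 2.0)
    if x <= z:
        return scale * Phi(s * x) * upper_tail(s * z)
    return scale * Phi(s * z) * upper_tail(s * x)

def check_part_a(z: float, psi: float, points: int = 1601):
    """Return (max |y|, max |x y|) over a grid covering |x| <= 8 / sqrt(1 - psi)."""
    c = 1.0 - psi
    half_width = 8.0 / math.sqrt(c)
    xs = [-half_width + 2.0 * half_width * i / (points - 1) for i in range(points)]
    ys = [stein_solution(x, z, psi) for x in xs]
    return max(abs(y) for y in ys), max(abs(x * y) for x, y in zip(xs, ys))

for psi in (0.0, 0.5):
    for z in (-1.0, 0.0, 0.7):
        sup_y, sup_xy = check_part_a(z, psi)
        print(f"psi={psi} z={z:+.1f}  sup|y|={sup_y:.6f}  sup|xy|={sup_xy:.6f}")
```

For $z = 0$ the value $y(0)$ equals $\frac14\sqrt{2\pi/(1-\psi)}$, so bound (a)(i) is attained.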

For (c)(iv), we argue only for $x < 0$, since the proof for $x > 0$ is entirely
similar. Noting that

$$y''(x) - (1 - \psi)\,x\,y'(x) = (1 - \psi)\,y(x) + h'(x),$$

we obtain

$$y'(x) = \int_{-\infty}^{x} \{(1 - \psi)\,y(t) + h'(t)\}\, e^{-(1-\psi)(t^2 - x^2)/2}\,dt;$$

hence

$$|x\,y'(x)| \le \{(1 - \psi)\|y\|_\infty + \|h'\|_\infty\}\,|x|\,e^{(1-\psi)x^2/2} \int_{-\infty}^{x} e^{-(1-\psi)t^2/2}\,dt \le \|y\|_\infty + \|h'\|_\infty/(1 - \psi).$$

But now, from (c)(i) above, $\|y\|_\infty \le 2\|h'\|_\infty/(1 - \psi)$. □